ICEAGE: Interactive Clustering and Exploration of Large and High-Dimensional Geodata
Abstract
The unprecedentedly large size and high dimensionality of existing geographic datasets make the complex patterns that potentially lurk in the data hard to find. Clustering is one of the most important techniques for geographic knowledge discovery. However, existing clustering methods have two severe drawbacks for this purpose. First, spatial clustering methods focus on the specific characteristics of distributions in 2- or 3-D space, while general-purpose high-dimensional clustering methods have limited power in recognizing spatial patterns that involve neighbors. Second, clustering methods in general are not geared toward allowing the human-computer interaction needed to effectively tease out complex patterns. In the current paper, an approach is proposed to open up the "black box" of the clustering process for easy understanding, steering, focusing and interpretation, and thus to support an effective exploration of large and high-dimensional geographic data. The proposed approach involves building a hierarchical spatial cluster structure within the high-dimensional feature space, and using this combined space for discovering multi-dimensional (combined spatial and non-spatial) patterns with efficient computational clustering methods and highly interactive visualization techniques. More specifically, this includes the integration of: (1) a hierarchical spatial clustering method to generate a 1-D spatial cluster ordering that preserves the hierarchical cluster structure, and (2) a density- and grid-based technique to effectively support the interactive identification of interesting subspaces and subsequent searching for clusters in each subspace. The proposed approach is implemented in a fully open and interactive manner, supported by various visualization techniques.
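As a rough illustration of the two ingredients named in the abstract, the sketch below derives a 1-D ordering from a hierarchical clustering of point locations and then counts points on a grid spanned by that ordering and one non-spatial attribute. It is not the paper's algorithm: it assumes plain single-linkage agglomeration on Euclidean distance, and the function names `spatial_cluster_ordering` and `grid_density`, the sample points, and the attribute values are all hypothetical.

```python
from itertools import combinations
from math import dist


def spatial_cluster_ordering(points):
    """Return (order, merges) from single-linkage agglomeration.

    order  : point indices arranged so that every cluster formed during
             the agglomeration occupies one contiguous run of the list.
    merges : (distance, left_members, right_members) for each merge.
    """
    # Each point starts as its own cluster; `members` maps a cluster id
    # to the ordered list of point indices it contains.
    members = {i: [i] for i in range(len(points))}
    cluster_of = list(range(len(points)))

    # Visiting point pairs from shortest to longest distance and merging
    # the clusters they connect is single-linkage clustering (assumption:
    # the paper may use a different linkage).
    pairs = sorted(
        (dist(points[a], points[b]), a, b)
        for a, b in combinations(range(len(points)), 2)
    )
    merges = []
    for d, a, b in pairs:
        ca, cb = cluster_of[a], cluster_of[b]
        if ca == cb:
            continue  # both points already share a cluster
        left, right = members.pop(ca), members.pop(cb)
        merges.append((d, tuple(left), tuple(right)))
        # Concatenation keeps both sub-clusters contiguous in the order.
        members[ca] = left + right
        for i in right:
            cluster_of[i] = ca
    order = next(iter(members.values())) if members else []
    return order, merges


def grid_density(order, attribute, bins):
    """Count points per cell of a bins x bins grid whose rows follow the
    position in the spatial ordering and whose columns follow the value
    of one non-spatial attribute."""
    n = len(order)
    lo, hi = min(attribute), max(attribute)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    counts = [[0] * bins for _ in range(bins)]
    for pos, idx in enumerate(order):
        row = min(pos * bins // n, bins - 1)
        col = min(int((attribute[idx] - lo) / span * bins), bins - 1)
        counts[row][col] += 1
    return counts


# Hypothetical data: two spatial groups and one attribute per point.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (0.5, 0.5)]
attribute = [1.0, 1.2, 9.0, 8.5, 1.1]
order, merges = spatial_cluster_ordering(points)
density = grid_density(order, attribute, bins=2)
print(order)
print(density)
```

Because each merge concatenates the two member lists, spatially close points end up adjacent in `order`, so dense cells in the grid point to groups that are compact both spatially and in the chosen attribute.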
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
Coordinating computational and visual approaches for interactive feature selection and multivariate clustering
Unknown (and unexpected) multivariate patterns lurking in high-dimensional datasets are often very hard to find. This paper describes a human-centered exploration environment, which incorporates a coordinated suite of computational and visualization methods to explore high-dimensional data for uncovering patterns in multivariate spaces. Specificall...
SYMBIOTIC ORGANISMS SEARCH AND HARMONY SEARCH ALGORITHMS FOR DISCRETE OPTIMIZATION OF STRUCTURES
In this work, a new hybrid Symbiotic Organisms Search (SOS) algorithm is introduced to design and optimize spatial and planar structures under structural constraints. The SOS algorithm is inspired by the interactive behavior between organisms to propagate in nature. But one of the disadvantages of the SOS algorithm is that due to its vast search space and a large number of organisms, it may trap i...
A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS
Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Due to the wide use of clustering algorithms in many fields, a lot of research is still going on to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of cluster centers and hence can become trapped in a local optimum. In this paper...
Understanding Hierarchical Clustering Results by Interactive Exploration of Dendrograms: A Case Study with Genomic Microarray Data
Hierarchical clustering is widely used to find patterns in multi-dimensional datasets, especially for genomic microarray data. Finding groups of genes with similar expression patterns can lead to better understanding of the functions of genes. Early software tools produced only printed results, while newer ones enabled some online exploration. We describe four general techniques that could be u...
Journal: GeoInformatica
Volume: 7
Publication date: 2003